Echo Chamber: A Game for Eliciting a Colloquial Paraphrase Corpus
Abstract
The problem of semantic equivalence, or paraphrase, is a fundamental one for applications that “understand” natural language. Learned approaches to this problem face a lack of colloquial training data with which to build models. This paper describes Echo Chamber, a game aimed at collecting sentential paraphrases from web users. Much of our current focus is on designing a framework that makes this potentially burdensome task engaging and challenging. The game draws on elements of enduring pen-and-paper games such as Battleship and Hangman, and incorporates a time component to impart a sense of urgency. In the final version, automated validation of input is intended to make the game scalable and able to collect high-quality data without editorial intervention. In this paper, we discuss design considerations and issues emerging from the prototype of this game.
Similar resources
A contrastive review of paraphrase acquisition techniques
This paper addresses the issue of what approach should be used for building a corpus of sentential paraphrases depending on one’s requirements. Six strategies are studied: (1) multiple translations into a single language from another language; (2) multiple translations into a single language from different other languages; (3) multiple descriptions of short videos; (4) multiple subtitles for th...
Extract Domain-specific Paraphrase from Monolingual Corpus for Automatic Evaluation of Machine Translation
Paraphrase can help match synonyms or match phrases with the same or similar meaning, thus it plays an important role in automatic evaluation of machine translation. The traditional approaches extract paraphrase in general domain from bilingual corpus. Because the WMT16 metrics task consists of three subtasks, namely news domain, medical domain, and IT domain, we propose to extract domainspecif...
Automatic conversion of colloquial Finnish to standard Finnish
This paper presents a rule-based method for converting between colloquial Finnish and standard Finnish. The method relies upon a small number of orthographical rules combined with a large language model of standard Finnish for ranking the possible conversions. Aside from this contribution, the paper also presents an evaluation corpus consisting of aligned sentences in colloquial Finnish, orthog...
A Class-oriented Approach to Building a Paraphrase Corpus
Towards deep analysis of compositional classes of paraphrases, we have examined a class-oriented framework for collecting paraphrase examples, in which sentential paraphrases are collected for each paraphrase class separately by means of automatic candidate generation and manual judgement. Our preliminary experiments on building a paraphrase corpus have so far been producing promising results, ...
Turkish Paraphrase Corpus
Paraphrases are alternative syntactic forms in the same language expressing the same semantic content. Speakers of all languages are inherently familiar with paraphrases at different levels of granularity (lexical, phrasal, and sentential). For quite some time, the concept of paraphrasing is getting a growing attention by the research community and its potential use in several natural language ...
Publication date: 2005